A Mixed Model for Cross Lingual Opinion Analysis

نویسندگان

Lin Gui

Ruifeng Xu

Jun Xu

Li Yuan

Yuanlin Yao

Jiyun Zhou

Qiaoyun Qiu

Shuwei Wang

Kam-Fai Wong

Ricky Cheung

چکیده

The performances of machine learning based opinion analysis systems are always puzzled by the insufficient training opinion corpus. Such problem becomes more serious for the resource-poor languages. Thus, the cross-lingual opinion analysis (CLOA) technique, which leverages opinion resources on one (source) language to another (target) language for improving the opinion analysis on target language, attracts more research interests. Currently, the transfer learning based CLOA approach sometimes falls to over fitting on single language resource, while the performance of the co-training based CLOA approach always achieves limited improvement during bi-lingual decision. Target to these problems, in this study, we propose a mixed CLOA model, which estimates the confidence of each monolingual opinion analysis system by using their training errors through bilingual transfer self-training and co-training, respectively. By using the weighted average distances between samples and classification hyper-planes as the confidence, the opinion polarity of testing samples are classified. The evaluations on NLP&CC 2013 CLOA bakeoff dataset show that this approach achieves the best performance, which outperforms transfer learning and co-training based approaches.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Instance Level Transfer Learning for Cross Lingual Opinion Analysis

This paper presents two instance-level transfer learning based algorithms for cross lingual opinion analysis by transferring useful translated opinion examples from other languages as the supplementary training data for improving the opinion classifier in target language. Starting from the union of small training data in target language and large translated examples in other languages, the Tran...

متن کامل

Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies

We propose a cross-lingual framework for fine-grained opinion mining using bitext projection. The only requirements are a running system in a source language and word-aligned parallel data. Our method projects opinion frames from the source to the target language, and then trains a system on the target language using the automatic annotations. Key to our approach is a novel dependency-based mod...

متن کامل

Overview of Multilingual Opinion Analysis Task at NTCIR-8: A Step Toward Cross Lingual Opinion Analysis

متن کامل

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining

We present the Trip-MAML dataset, a Multi-Lingual dataset of hotel reviews that have been manually annotated at the sentence-level with Multi-Aspect sentiment labels. This dataset has been built as an extension of an existent English-only dataset, adding documents written in Italian and Spanish. We detail the dataset construction process, covering the data gathering, selection, and annotation. ...

متن کامل

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2013

A Mixed Model for Cross Lingual Opinion Analysis

نویسندگان

چکیده

منابع مشابه

Instance Level Transfer Learning for Cross Lingual Opinion Analysis

Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies

Overview of Multilingual Opinion Analysis Task at NTCIR-8: A Step Toward Cross Lingual Opinion Analysis

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining

English-Persian Plagiarism Detection based on a Semantic Approach

عنوان ژورنال:

اشتراک گذاری